- NIST SPHERE Header Structure
-
-
- The NIST SPHERE header is an object-oriented, 1024-byte blocked, ASCII
- structure which is prepended to the waveform data. The header is composed
- of a fixed-format portion followed by an object-oriented variable portion.
- The fixed portion is as follows:
-
- NIST_1A<new-line>
- 1024<new-line>
-
- The first line specifies the header type and the second line specifies the
- header length. Each of these lines is 8 bytes long (including the new-line).
- Together they identify the header and allow programs that do not need the
- subsequent header information to skip over it.
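
As an illustration of how a program might use the fixed portion to skip
the header, the following C sketch validates the first 16 bytes and returns
the header length. The function name nist_header_length is hypothetical and
is not part of the SPHERE library. The sketch assumes that the length line
is padded with spaces to fill its 8 bytes, and that the length is a multiple
of 1024, following the "1024-byte blocked" description above.

```c
#include <stdlib.h>
#include <string.h>

/* Parse the 16-byte fixed portion of a NIST_1A header held in buf.
 * Returns the header length in bytes, or -1 if the prefix is malformed.
 * (Hypothetical helper for illustration; not a SPHERE function.) */
long nist_header_length(const char *buf, size_t len)
{
    char field[8];
    char *end;
    long n;

    if (len < 16 || memcmp(buf, "NIST_1A\n", 8) != 0 || buf[15] != '\n')
        return -1;

    memcpy(field, buf + 8, 7);      /* second line without its new-line */
    field[7] = '\0';
    n = strtol(field, &end, 10);    /* strtol skips leading spaces */
    if (end == field)
        return -1;
    while (*end == ' ')             /* tolerate trailing padding as well */
        end++;
    /* Assumption: a valid length is a positive multiple of 1024. */
    if (*end != '\0' || n <= 0 || n % 1024 != 0)
        return -1;
    return n;
}
```

A caller would read the first 16 bytes of a file, pass them to this
function, and then seek or read forward to the returned offset to reach
the waveform data.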
-
-
- The remaining object-oriented variable portion is composed of
- object-type-value "triple" lines which have the following format:
-
-
- <LINE> ::= <TRIPLE><new-line> |
-            <COMMENT><new-line> |
-            <TRIPLE><COMMENT><new-line>
-
- <TRIPLE> ::= <OBJECT><space><TYPE><space><VALUE><OPT-SPACES>
-
- <OBJECT> ::= <PRIMARY-SUBOBJECT> |
-              <PRIMARY-SUBOBJECT><SECONDARY-SUBOBJECT>
-
- <PRIMARY-SUBOBJECT> ::= <ALPHA> | <ALPHA><ALPHA-NUM-STRING>
- <SECONDARY-SUBOBJECT> ::= _<ALPHA-NUM-STRING> |
-                           _<ALPHA-NUM-STRING><SECONDARY-SUBOBJECT>
-
- <TYPE> ::= -<INTEGER-FLAG> | -<REAL-FLAG> | -<STRING-FLAG>
-
- <INTEGER-FLAG> ::= i
- <REAL-FLAG> ::= r
- <STRING-FLAG> ::= s<DIGIT-STRING>
-
- <VALUE> ::= <INTEGER> | <REAL> | <STRING> (depending on object type)
-
- <INTEGER> ::= <SIGN><DIGIT-STRING>
- <REAL> ::= <SIGN><DIGIT-STRING>.<DIGIT-STRING>
-
- <OPT-SPACES> ::= <SPACES> | NULL
-
- <COMMENT> ::= ;<STRING> (excluding embedded new-lines)
-
- <ALPHA-NUM-STRING> ::= <ALPHA-NUM> | <ALPHA-NUM><ALPHA-NUM-STRING>
- <ALPHA-NUM> ::= <DIGIT> | <ALPHA>
- <ALPHA> ::= a | ... | z | A | ... | Z
- <DIGIT-STRING> ::= <DIGIT> | <DIGIT><DIGIT-STRING>
- <DIGIT> ::= 0 | ... | 9
- <SIGN> ::= + | - | NULL
- <SPACES> ::= <space> | <SPACES><space>
- <STRING> ::= <CHARACTER> | <CHARACTER><STRING>
- <CHARACTER> ::= char(0) | char(1) | ... | char(255)
-
- The currently defined objects (used in this database) are listed
- in the file "stdfield.c". The list may be expanded for future
- databases, since the grammar does not impose any limit on the number
- of objects. The file is simply a repository for "standard" object
- definitions.
-
- The single object "end_head" marks the end of the active header and the
- remaining unused header space is undefined. A sample header is included
- below.
-
- -- John Garofolo
-
-
-
-
- NIST_1A
- 1024
- database_id -s5 TIMIT
- database_version -s3 1.0
- utterance_id -s8 aks0_sa1
- channel_count -i 1
- sample_count -i 63488
- sample_rate -i 16000
- sample_min -i -6967
- sample_max -i 7710
- sample_n_bytes -i 2
- sample_byte_format -s2 01
- sample_sig_bits -i 16
- end_head
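
To make the grammar concrete, here is a minimal C sketch that parses one
object-type-value line such as those in the sample header. The names
parse_triple, struct field and enum field_type are hypothetical and are not
taken from the SPHERE library. The sketch handles a single NUL-terminated
line, so it does not cover comments, string values that contain embedded
new-lines or char(0), or names and strings longer than its fixed buffers.

```c
#include <stdlib.h>
#include <string.h>

enum field_type { F_INTEGER, F_REAL, F_STRING };

struct field {
    char name[64];
    enum field_type type;
    long ival;          /* valid when type == F_INTEGER */
    double rval;        /* valid when type == F_REAL */
    char sval[128];     /* valid when type == F_STRING */
};

/* Parse "<OBJECT> <TYPE> <VALUE>".
 * Returns 1 if a field was parsed, 0 for end_head, -1 if malformed. */
int parse_triple(const char *line, struct field *f)
{
    const char *sp1, *sp2, *value;
    char *end;
    size_t name_len;
    long n;

    if (strncmp(line, "end_head", 8) == 0)
        return 0;

    sp1 = strchr(line, ' ');                    /* end of <OBJECT> */
    sp2 = sp1 ? strchr(sp1 + 1, ' ') : NULL;    /* end of <TYPE> */
    if (sp2 == NULL || sp1[1] != '-')
        return -1;

    name_len = (size_t)(sp1 - line);
    if (name_len == 0 || name_len >= sizeof f->name)
        return -1;
    memcpy(f->name, line, name_len);
    f->name[name_len] = '\0';

    value = sp2 + 1;
    switch (sp1[2]) {
    case 'i':
        if (sp2 != sp1 + 3)                     /* type must be exactly "-i" */
            return -1;
        f->type = F_INTEGER;
        f->ival = strtol(value, &end, 10);
        return end != value ? 1 : -1;
    case 'r':
        if (sp2 != sp1 + 3)                     /* type must be exactly "-r" */
            return -1;
        f->type = F_REAL;
        f->rval = strtod(value, &end);
        return end != value ? 1 : -1;
    case 's':
        n = strtol(sp1 + 3, &end, 10);          /* byte count after "-s" */
        if (end != sp2 || n <= 0 || (size_t)n >= sizeof f->sval
                || strlen(value) < (size_t)n)
            return -1;
        f->type = F_STRING;
        memcpy(f->sval, value, (size_t)n);      /* take exactly n bytes */
        f->sval[n] = '\0';
        return 1;
    default:
        return -1;
    }
}
```

The string case relies on the byte count in the type flag rather than on a
terminator, which is why "-s5 TIMIT" yields exactly the five bytes "TIMIT".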
-
-
- SPHERE
- NIST SPeech HEader REsources
-
- Release 1.7 (beta)
- June 1991
-
- 0. Introduction:
-
- SPHERE is a software package containing:
- 1. a set of C functions that can be used to:
- a) create and modify NIST speech file headers (in memory)
- b) read (write) NIST speech file headers from (to) disk
- 2. a set of basic utility programs that use the functions
-
- This software has been developed for use within the DARPA speech research
- community. Although care has been taken to ensure that all software is
- complete and bug-free, it is made available to the speech research
- community without endorsement or express or implied warranties.
-
-
-
- 1. Usage:
-
- User programs are linked with the library libsp.a, which contains
- the C functions mentioned above and some other functions that are
- used to support them. Functions that begin with "sp_" are intended
- to be callable by user programs. The semantics of the functions
- in this library are described in a manual page and comments in the
- source code.
-
- All functions in the library that return pointers will return NULL
- pointers on failure/error. All numeric functions will return negative
- values on failure/error.
-
- User programs that call the functions should "#include" the files
- header.h and sp.h. The former contains, among other things, some
- type definitions used by the library functions. The latter contains
- declarations for all user functions in the library.
-
-
-
- 2. Installation:
-
- To install SPHERE on a Unix system, use the Unix utility "make".
- While in the directory where the package source code has been
- installed, type:
-
- make -f makefile.
-
- The speech header library will be created, as well as the sample
- programs. Installation on non-Unix systems without "make" will
- probably require manual compilation.
-
-
-
- 3. Sample Programs:
-
- Several sample programs have been included in this release to
- demonstrate the functionality of the SPHERE header library.
- Manual pages exist for "h_read" and "h_edit". Short descriptions
- of the others follow:
-
- h_add { inputfile | - } { outputfile | - }
- adds a header to the data in inputfile and stores
- the result in outputfile; a dash in place of
- inputfile means read from stdin; a dash in place of
- outputfile means write to stdout
-
- h_strip { inputfile | - } { outputfile | - }
- strips the header from inputfile, stores the
- remaining data in outputfile
-
- h_test
- tests the header routines through an endless
- loop that stores and retrieves values from a header;
- a bug in the library should show up as either an
- insertion/retrieval error or a memory allocation
- error/failure
-
- h_nlrm
- removes newline characters from fields
-
- h_delete
- deletes header fields
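
Conceptually, adding a header to raw data means emitting a block whose
length is a multiple of 1024 bytes and whose second line states that length.
The C sketch below builds such a block in memory. It is not how h_add or
sp_write_header() is implemented; build_header and BLOCK are hypothetical
names, and padding the unused space with spaces is a choice made by this
sketch, since the format leaves that space undefined.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK 1024

/* Build a NIST_1A header from already-formatted field lines (each ending
 * in a new-line, without end_head). Returns a malloc'd block and stores
 * its size in *size_out, or returns NULL on failure.
 * (Hypothetical helper for illustration; not a SPHERE function.) */
char *build_header(const char *fields, size_t *size_out)
{
    static const char end_marker[] = "end_head\n";
    size_t fields_len = strlen(fields);
    size_t marker_len = strlen(end_marker);
    /* 16 bytes of fixed portion, then round up to a whole block. */
    size_t size = ((16 + fields_len + marker_len + BLOCK - 1) / BLOCK) * BLOCK;
    char fixed[17];
    char *h;

    if (size > 9999999)             /* length must fit in 7 digits */
        return NULL;
    h = malloc(size);
    if (h == NULL)
        return NULL;

    memset(h, ' ', size);           /* padding content is this sketch's choice */
    snprintf(fixed, sizeof fixed, "NIST_1A\n%7lu\n", (unsigned long)size);
    memcpy(h, fixed, 16);
    memcpy(h + 16, fields, fields_len);
    memcpy(h + 16 + fields_len, end_marker, marker_len);
    *size_out = size;
    return h;
}
```

Because the block is assembled in memory, its final size is known before
anything is written to the output stream. The caller writes the block,
follows it with the waveform data, and frees the block afterwards.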
-
- The abstract data type that programs use is a pointer to a "header_t"
- structure. This pointer is a handle used by the functions that operate
- on the header, much like a FILE pointer is used as a handle by functions
- in the C "stdio" library that operate on files.
-
- One difference is that there are two ways for a user program to
- obtain a header pointer. The first is to call the function that
- returns a pointer to an empty header. The second is to call the
- function that reads the fields from a speech file into a header.
-
- Another difference is that there is no set limit on the number of
- headers a program can have existing in memory at a given time.
- The C "stdio" library typically limits the number of open files
- to between 20 and 100, because the actual array of FILE structures
- is declared statically. Header structures are allocated dynamically,
- so user programs should deallocate headers that are no longer in
- use to avoid running out of memory.
-
-
-
- 4. Documentation:
-
- The following documentation files are also located in this directory:
-
- changes.doc - list of recent modifications to SPHERE
- disclaim.doc - NIST software disclaimer
- header.doc - description of NIST header structure
- readme.doc - this file
-
- h_read.1 - manual page for the command "h_read"
- h_read.doc - simple text version of "h_read.1"
-
- h_edit.1 - manual page for the command "h_edit"
- h_edit.doc - simple text version of "h_edit.1"
-
- sphere.3 - manual page for the SPHERE function library
- sphere.doc - simple text version of "sphere.3"
-
-
-
- 5. Bug Reports:
-
- Please report any bugs to John Garofolo by sending email to
- john@jaguar.ncsl.nist.gov.
-
- Please include a description of the bug/problem and the hardware
- and software under which the problem occurred, as well as any data
- needed to reproduce the problem.
-
- The most recent version of the SPHERE package is available
- via anonymous ftp from jaguar.ncsl.nist.gov [129.6.48.157] in
- compressed tar form as "sphere-v.tar.Z" (where "v" is the version
- code).
-
- 6. Changes in Release 1.5:
-
- 1. New functions were added to the Sphere library:
- sp_get_fieldnames()
- sp_get_type()
- sp_get_size()
- sp_is_std()
-
- (see the sphere library man page for descriptions)
-
- 2. h_read: command line options were changed
-
- 3. h_strip: writes to stdout if destination is "-"
-
- 4. man page for h_read
-
-
- 7. Changes in Release 1.6:
-
- 1. Utilities that use h_modify.c are now much faster in
- most cases when editing in-place -- if the size
- of the header does not change, the new header is
- copied over the old one.
-
- 2. Modified sp_write_header() to work when writing to
- objects other than files. The function ftell()
- was previously used directly on the output
- stream to ascertain the number of bytes
- in the header; now the header is written
- to a temp file to ascertain the header size,
- then to the output stream.
-
- 3. Modified sp_open_header() and spx_read_header()
- so that they no longer test whether the input file
- is at position 0. This allows reading from
- pipes, etc.
-
- 4. h_add: can read from stdin and/or write to stdout;
- no longer puts any dummy fields in the header.
-
-
- 5. h_strip: can now read from stdin in addition to
- writing to stdout.
-
- 6. Added h_header and raw2nist to the Sphere package.
- They are Bourne shell scripts (/bin/sh) to,
- respectively, print file headers and convert raw
- data (no header) to Sphere format.
-
- 7. Manual pages for commands h_edit, h_delete, h_add,
- h_strip and raw2nist
-
- 8. Changes in Release 1.7:
-
- 1. h_read: added "-C field" option to check that the
- specified field(s) are present in the headers of all
- files on the command line.
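
The kind of check that "-C field" performs can be sketched in C without the
library, by scanning the header text line by line until end_head. This is an
illustration only and does not reflect how h_read is implemented;
header_has_field is a hypothetical name. The sketch assumes the header text
is NUL-terminated and that no string value contains an embedded new-line.

```c
#include <string.h>

/* Return 1 if a field called name appears before end_head in the
 * NUL-terminated header text, 0 otherwise.
 * (Hypothetical helper for illustration; not a SPHERE function.) */
int header_has_field(const char *header, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = header;

    if (name_len == 0)
        return 0;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, "end_head", 8) == 0)
            return 0;                   /* end of the active header */
        /* A match needs the whole name followed by the separating space. */
        if (strncmp(line, name, name_len) == 0 && line[name_len] == ' ')
            return 1;
        line = strchr(line, '\n');      /* advance to the next line */
        if (line != NULL)
            line++;
    }
    return 0;
}
```

Anything after end_head is ignored, in keeping with the rule that the
remaining header space is undefined.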
-
-
- --------------------------
- Stan Janet
- stan@jaguar.ncsl.nist.gov
-